Abstract

Does there exist an O(1)-competitive (self-adjusting) binary search tree (BST) algorithm? This
is a well-studied problem. A simple offline BST algorithm, GreedyFuture, was proposed
independently by Lucas [5] and Munro [6], and they conjectured it to be O(1)-competitive. Recently,
Demaine et al. [2] gave a geometric view of the BST problem. This view allowed them to
give an online algorithm, GreedyArb, with the same cost as GreedyFuture. However, no
o(n)-competitive ratio was known for GreedyArb. In this paper we make progress towards proving
an O(1)-competitive ratio for GreedyArb by showing that it is O(log n)-competitive.

1 Introduction 

Binary search trees (BSTs) are a data structure for the dictionary problem. There are many examples
of (static or offline) binary search trees, e.g. AVL trees and red-black trees, which take O(log n)
worst-case time per search query (and possibly for other types of operations such as insert/delete;
but we will confine ourselves to search queries in this note) on n keys. For static trees, the O(log n)
bound cannot be improved. But trees that can change shape in response to queries can potentially
have smaller amortized search time per query. In this case, competitive analysis is used to measure
the performance.

Splay trees [9] of Sleator and Tarjan are simple self-adjusting binary search trees which are
O(log n)-competitive. Sleator and Tarjan [9] conjectured that Splay trees were in fact
O(1)-competitive. Unfortunately, despite considerable efforts, o(log n)-competitiveness for Splay trees
is not known; see, e.g., [8] for a recent discussion. But even a potentially easier question remains
open: Is there any BST algorithm which is O(1)-competitive? In the past decade progress was
made on this question and BST algorithms with better competitive ratio were discovered: Tango
trees [3] were the first O(log log n)-competitive BSTs; Multi-Splay trees [10] and Zipper trees [1]
also have the same competitive ratio along with some additional properties. Analyses of these
trees use lower bounds for the time taken to complete a sequence of requests. Wilber [11] gave
two different such lower bounds. Wilber's first lower bound is used in [3], [10] and [1] to obtain
O(log log n)-competitiveness. These techniques based on Wilber's bound have so far failed to give
o(log log n)-competitiveness.

Even the offline problem is not well understood, and the best performance guarantee known is
the same as the online guarantee obtained by Tango trees just mentioned. Lucas [5] and Munro [6]
independently proposed an offline BST algorithm GreedyFuture (this name comes from [2]) and
they conjectured that GreedyFuture was O(1)-competitive. But even o(n)-competitiveness was
not known for GreedyFuture. Our main result, Theorem 1.1 below, implies that GreedyFuture is
O(log n)-competitive, thus making progress towards an O(1)-competitive ratio for GreedyFuture.

A new line of attack on the BST problem was given by Demaine et al. [2]. They gave a geometric 
view of the problem of designing BST algorithms. This allowed them to translate conjectures and 
results about BSTs into intuitively appealing geometric statements. We now quickly describe their 
geometric view. 

1.1 Problem Definition 

In the BST problem we want to maintain n keys in a binary search tree to serve search requests. 
In response to each search request, the algorithm is allowed to modify the structure of the tree, but 
this change in the structure adds to the search cost. We are interested in designing BST algorithms 
with small amortized search cost, in other words, algorithms with small competitive ratio. Since 
we will only work in the geometric view, we define the BST problem formally only in the geometric 
model and omit the formal definition of the BST model; please see [2] for details. 

In the geometric view of the BST problem, we will work in the two-dimensional plane with a
fixed Cartesian coordinate system. We have $n$ keys in the tree, which we will assume to be $1, 2, \ldots, n$.
And we have $n$ search queries coming at time instants $1, 2, \ldots, n$, one for each key (this, of course,
is not the general situation as a key could be searched more than once, but as discussed in [2], we
can work with this case without loss of generality). We can represent these $n$ queries as points
in the plane: Let $X = \{p_1, p_2, \ldots, p_n\}$ be a set of $n$ points in the two-dimensional plane defined as
follows. Let the x-axis represent the key space and the y-axis represent time. Each $p_i$ is represented by
a pair $(i, t_i)$ where both $i$ and $t_i$ are integers. We say that the key $i$ arrives at time $t_i$. For a point
$p$, let $p.x$ denote its x-coordinate and $p.y$ denote its y-coordinate. Clearly for all distinct $p, q \in X$
we have $p.x \neq q.x$ and $p.y \neq q.y$. That is to say, there exists exactly one point from $X$ on line $x = i$,
for $1 \le i \le n$. Similarly, there exists exactly one point from $X$ on line $y = i$, for $1 \le i \le n$. For a
pair of points $p, q$ not on the same horizontal or vertical line, the axis-aligned rectangle formed by
$p$ and $q$ is denoted by $\square pq$.
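The mapping from a request sequence to the point set $X$ can be sketched in a few lines of Python. This is only an illustration: the function name `points_from_requests` is hypothetical, and the check that every key is requested exactly once reflects the simplifying assumption made above rather than the general BST problem.

```python
def points_from_requests(requests):
    """Map a request sequence to points (key, time).

    requests[t-1] is the key searched at time t. Following the simplifying
    assumption in the text, each key 1..n must be requested exactly once.
    """
    n = len(requests)
    if sorted(requests) != list(range(1, n + 1)):
        raise ValueError("expected each key 1..n to be requested exactly once")
    # x-coordinate = key, y-coordinate = arrival time
    return [(key, t) for t, key in enumerate(requests, start=1)]
```

For example, the sequence 2, 1, 3 becomes the points (2, 1), (1, 2), (3, 3): one point on every vertical line and one on every horizontal line.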

Definition 1.1 ([2]). A pair of points $(p, q)$ is said to be arborally satisfied with respect to a point
set $P$, if (1) $p$ and $q$ lie on the same horizontal or vertical line, or (2) $\exists r \in P \setminus \{p, q\}$ such that $r$
lies inside or on the boundary of $\square pq$. A point set $P$ is arborally satisfied if all pairs of points in
$P$ are arborally satisfied with respect to $P$.
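A direct, brute-force reading of this definition is sketched below in Python. The function name `is_arborally_satisfied` and the representation of points as `(x, y)` tuples are assumptions made for illustration; the cubic running time is chosen for clarity, not efficiency.

```python
from itertools import combinations

def is_arborally_satisfied(points):
    """Return True if every pair of points is arborally satisfied w.r.t. the set."""
    pts = set(points)
    for p, q in combinations(pts, 2):
        if p[0] == q[0] or p[1] == q[1]:
            continue  # condition (1): same vertical or horizontal line
        lo_x, hi_x = min(p[0], q[0]), max(p[0], q[0])
        lo_y, hi_y = min(p[1], q[1]), max(p[1], q[1])
        # condition (2): a third point inside or on the boundary of the rectangle
        witness = any(
            lo_x <= r[0] <= hi_x and lo_y <= r[1] <= hi_y
            for r in pts
            if r != p and r != q
        )
        if not witness:
            return False
    return True
```

For instance, the two points (1, 1) and (2, 2) alone are not arborally satisfied, while adding (1, 2) satisfies the pair.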

Arborally Satisfied Set (ArbSS) Problem: Given a point set $X$, find a minimum cardinality
point set $Y$ such that $X \cup Y$ is arborally satisfied. [While several definitions here are from [2],
we have chosen to use less colorful abbreviations than in [2].] Let MinArb($X$) denote the minimum
cardinality point set which solves the ArbSS problem on a point set $X$. [2] shows that the BST
view and the geometric view are essentially equivalent. In particular, if OPT($S$) is the minimum
cost of serving a request sequence $S$, then $|\mathrm{MinArb}(X)| = \Theta(\mathrm{OPT}(S))$, where $X$ is the set of
points in the plane corresponding to $S$.

1.2 GreedyArb Algorithm 

There is a natural greedy algorithm for the ArbSS problem:

Sweep the point set $X$ with a horizontal line by increasing the y-coordinates. Let the
point $p$ be processed at time $p.y$. At time $p.y$, place the minimal number of points on
line $y = p.y$ to satisfy the rectangles with $p$ as one endpoint and the other endpoint in $X \cup Y$
with y-coordinate less than $p.y$. This minimal set of points $M_p$ is uniquely defined: for
any unsatisfied rectangle formed with $(s, p.y)$ as one of its corner points, add a point at
$(s, p.y)$.

Figure 1: The red points are the point set $X$ and the blue points are added by the GreedyArb algorithm.
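The sweep described above can be sketched as a brute-force Python routine. This is an illustrative reading of the description, not the authors' implementation: the names `rect_is_empty` and `greedy_arb` are hypothetical, points are `(x, y)` tuples, and "unsatisfied" is evaluated against the points present before anything is added on the current line.

```python
def rect_is_empty(p, q, pts):
    """True if no point of pts other than p and q lies in the closed rectangle of p and q."""
    lo_x, hi_x = min(p[0], q[0]), max(p[0], q[0])
    lo_y, hi_y = min(p[1], q[1]), max(p[1], q[1])
    return not any(
        lo_x <= r[0] <= hi_x and lo_y <= r[1] <= hi_y
        for r in pts
        if r != p and r != q
    )

def greedy_arb(X):
    """Return the set Y of points added by the greedy sweep on the input points X."""
    current = set()  # points of X and Y on lines swept so far
    Y = set()
    for p in sorted(X, key=lambda pt: pt[1]):  # sweep by increasing y
        current.add(p)
        # columns s holding an earlier point q whose rectangle with p is unsatisfied
        cols = {
            q[0]
            for q in current
            if q[0] != p[0] and q[1] != p[1] and rect_is_empty(p, q, current)
        }
        added = {(s, p[1]) for s in cols}  # this is M_p
        Y |= added
        current |= added
    return Y
```

On the input (2, 1), (1, 2), (3, 3) this sketch adds the points (2, 2) and (2, 3).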

GreedyArb is an online algorithm in the sense that at each time instant it adds a set of points
so that the resulting set is arborally satisfied, and it does so without the knowledge of the
future requests. [2] shows that GreedyArb can be used to derive an online BST algorithm with
the same competitive ratio. It is conjectured that GreedyArb is O(1)-competitive, that is, the
number of points added by the GreedyArb algorithm on a point set $X$ is $O(|\mathrm{MinArb}(X)|)$.
Surprisingly, GreedyArb has the same competitive ratio as the aforementioned (offline) BST algorithm
GreedyFuture [2]. To our knowledge, no non-trivial (i.e. o(n)) competitive factor was known for
GreedyFuture or GreedyArb. In this paper we prove the following result:

Theorem 1.1. GreedyArb is O(log n)-competitive.

Our analysis of GreedyArb exposes some interesting combinatorial properties of the GreedyArb
algorithm, which may be useful in further improving the competitive ratio. [After we had written
our results, it came to our attention that Patrascu in his talk slides [7] has claimed that with
Iacono he has proved that GreedyArb is O(log n)-competitive. To our knowledge, this result has
not appeared anywhere, and our work was done independently.]



2 Proof of the main result 

In this section we prove Theorem 1.1. Let us first outline our approach.

Let $X$ be the input set of $n$ points that we want to arborally satisfy by adding more points.
We want to prove that GreedyArb adds $O(n \log n)$ points. We prove this by using the standard
recurrence $T(n) = 2T(n/2) + O(n)$, where $T(n)$ is the maximum possible number of points added by
GreedyArb on sets of $n$ points. We interpret the previous equation as follows: Divide the $n$ points
into two equal sets $P$ (points with x-coordinate in $\{1, 2, \ldots, n/2\}$) and $Q$ (the rest of the points). For
sets $P$ and $Q$ we define regions of the plane $R_P$ and $R_Q$ in the natural way: $R_P = \{r \mid 1/2 < r.x <
n/2 + 1/2\}$, and similarly $R_Q = \{r \mid n/2 + 1/2 < r.x < n + 1/2\}$. We show that the total number
of points added by GreedyArb in $R_P$ when processing points in $Q$ is $O(n)$, and by symmetry, the
number of points added in $R_Q$ when processing points in $P$ is $O(n)$. This gives the recurrence above
for the top level.

Figure 2: $p_1$, $p_2$ and $p_3$ are corner points in $P$ for $Q$ at time $t$.

But we have to prove that the property holds more generally for our recursion
to hold at all levels. In general, we show the following: Consider any set of $2k$ consecutive keys,
with $P$ consisting of the points corresponding to the first $k$ keys and $Q$ consisting of the points
corresponding to the last $k$ keys. Then the total number of points added in $R_Q$ by GreedyArb when
processing points in $P$ is $O(k)$, and similarly, the number of points added in $R_P$ when processing
points in $Q$ is $O(k)$. The rest of the proof is devoted to showing this last statement. Once we prove
this, we get the above recurrence and that immediately completes the proof of Theorem 1.1.
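For reference, the step from the recurrence to the $O(n \log n)$ bound can be spelled out by unrolling. The sketch below assumes that $n$ is a power of two, writes the $O(n)$ term as $cn$ for a constant $c$, and takes $T(1)$ to be a constant; the constant $c$ and the power-of-two assumption are introduced here only for illustration.

```latex
\begin{align*}
T(n) &\le 2\,T(n/2) + cn \\
     &\le 4\,T(n/4) + 2cn \\
     &\;\;\vdots \\
     &\le 2^{i}\,T(n/2^{i}) + i\,cn \\
     &\le n\,T(1) + cn\log_2 n \qquad (i = \log_2 n) \\
     &= O(n \log n).
\end{align*}
```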

We now proceed with the formal proof. 

Let $S = \{p_j, p_{j+1}, \ldots, p_{j+2k-1}\}$ be a set of $2k$ consecutive points in $X$ such that $p_{i+1}.x = p_i.x + 1$
for all $j \le i < j + 2k - 1$. Let $P = \{p_j, p_{j+1}, \ldots, p_{j+k-1}\}$ and $Q = \{p_{j+k}, p_{j+k+1}, \ldots, p_{j+2k-1}\}$. Define
$R_P$ to be the region between the vertical lines passing through the points $j - 1/2$ and $(j + k - 1) + 1/2$;
similarly, $R_Q$ is the region between the vertical lines passing through the points $(j + k) - 1/2$
and $(j + 2k - 1) + 1/2$. Let $P_l = \{p_1, p_2, \ldots, p_{j-1}\}$ and $Q_r = \{p_{j+2k}, p_{j+2k+1}, \ldots, p_n\}$. Let $R_{P_l}$ be
the region to the left of $R_P$ such that it contains all the points in $P_l$. Similarly, $R_{Q_r}$ is the region to
the right of $R_Q$ and it contains all the points in $Q_r$. Let $X_{<i}$ denote the set of all the points in $X$
which arrive before time $i$. Let $M_p$ be the points added by GreedyArb while processing point $p$.
Let $M_p^P = \{m \in M_p \mid m \text{ lies in region } R_P\}$.

Definition 2.1. Let $Z_{<t} = \{p \in X_{<t} \mid p \text{ lies in region } R_P\} \cup \{m \in M_p \mid p \in X,\ p.y < t, \text{ and
point } m \text{ lies in the region } R_P\}$. A point $q \in Z_{<t}$ is said to be a corner point in $P$ for $Q$ at time $t$
if there is no point $q' \in Z_{<t} \setminus \{q\}$ such that $q'.x \ge q.x$ and $q'.y \ge q.y$. A point $q \in Z_{<t}$ is said to
be a corner point in $P$ for $P_l$ at time $t$ if there is no point $q' \in Z_{<t} \setminus \{q\}$ such that $q'.x \le q.x$ and
$q'.y \ge q.y$. Let $C_t$ be the set of corner points in $P$ for $Q$ at time $t$.
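The corner points for $Q$ form a staircase of points that are not dominated towards the upper right. The Python sketch below computes them by brute force. It reads both inequalities in the definition as non-strict, which is the reading the proofs of Lemmas 2.1 and 2.2 rely on (a point with another point directly above it, or directly to its right on the same line, is not a corner point); the function name `corner_points_for_Q` is hypothetical.

```python
def corner_points_for_Q(Z):
    """Points of Z with no other point of Z weakly to the right and weakly above."""
    Z = set(Z)
    return {
        q
        for q in Z
        if not any(r != q and r[0] >= q[0] and r[1] >= q[1] for r in Z)
    }
```

For example, among the points (1, 5), (2, 3), (3, 4), (4, 1), the point (2, 3) is dominated by (3, 4), so the corner points are (1, 5), (3, 4) and (4, 1).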

Lemma 2.1. Let $p \in X$ be the point arriving at time $t$. If $p \in P$, then $|C_{t+1}| \le |C_t| + 1$.

Proof. $M_p^P$ is the set of points added by GreedyArb in region $R_P$ at time $t$. Let $r \in (M_p^P \cup \{p\})$
be a point such that for all $r' \in (M_p^P \cup \{p\})$, $r.x \ge r'.x$. That is, $r$ is the rightmost point in the set
$M_p^P \cup \{p\}$. No point in $(M_p^P \cup \{p\}) \setminus \{r\}$ can be a corner point for $Q$ at time $t + 1$ because each one of
them has point $r$ to its right in $R_P$. Only point $r$ can potentially be a corner point among the
points in $M_p^P \cup \{p\}$. So the number of corner points can increase by at most one at time $t + 1$. □

Lemma 2.2. Let $p \in X$ be the point arriving at time $t$. If $p \in Q \cup Q_r$, then (1) if $|M_p^P| = 0$, then
$|C_{t+1}| = |C_t|$; and (2) if $|M_p^P| > 0$, then $|C_{t+1}| \le |C_t| - (|M_p^P| - 1)$.

Proof. If $|M_p^P| = 0$, then the number of corner points cannot change as there are no points on
line $y = t$ in region $R_P$. So let $|M_p^P| > 0$. The execution of GreedyArb tries to satisfy all the
unsatisfied rectangles with one corner $p$. For each marked point in $M_p^P$, the corresponding other
corner of the rectangle must lie in $C_t$. Let $N_p^P = \{q \mid q \in C_t$ and GreedyArb adds a point $r \in M_p^P$
due to unsatisfied rectangle $\square qp\}$. At time $t + 1$, all the points in $N_p^P$ cease to be corner
points, because $\forall q \in N_p^P\ \exists r \in M_p^P$ such that $r.x = q.x$ and $r.y > q.y$. So the decrease in the number
of corner points is at least $|N_p^P| = |M_p^P|$. As in the proof of Lemma 2.1, only one point in $M_p^P$ can
become a corner point for $Q$ at time $t + 1$. So $|C_{t+1}| \le |C_t| - (|M_p^P| - 1)$. □

Assume that $P_l = \emptyset$. Let $p$ be the last point to arrive in set $Q$. Applying Lemmas 2.1 and 2.2
inductively we get

$$|C_{p.y}| \le |C_1| + \sum_{q \in P} 1 - \sum_{q \in Q \cup Q_r} \left(|M_q^P| - 1\right).$$

Since $|C_1| = 0$, this gives

$$\sum_{q \in Q \cup Q_r} \left(|M_q^P| - 1\right) \le -|C_{p.y}| + |P|,$$

which in turn gives

$$\sum_{q \in Q} \left(|M_q^P| - 1\right) \le -|C_{p.y}| + |P|,$$

and so

$$\sum_{q \in Q} |M_q^P| \le |P| + \sum_{q \in Q} 1 \le |P| + |Q| \le 2k.$$

So if $P_l = \emptyset$, then the number of points added while processing points in $Q$ in region $R_P$ is $O(k)$.
But if $P_l \neq \emptyset$, then the quantity $\sum_{q \in Q} |M_q^P|$ may be larger. We will argue that it is still $O(k)$. We
denote the set of all the points added by GreedyArb as $Y$.

Definition 2.2. For a point $p \in P$ let $T^p_{<t}$ be all the points from $X \cup Y$ on line $x = p.x$ with their
y-coordinate less than $t$, where $t > p.y$. Let $q$ be the point in $T^p_{<t}$ with the largest y-coordinate. We say
that point $p$ is hidden at time $t$ if there exist points $q_l, q_r \in R_P$ respectively to the left and right of
$q$ on line $y = q.y$. If point $p$ is hidden at time $t$, point $q$ cannot be a corner point for $Q$ at time $t$.

Definition 2.3. For a point $p \in P$ let $r$ be the point in $T^p_{<t}$ with the largest y-coordinate. We say
that the point $p$ is exposed at time $t$ if either there is no point to the left or no point to the right
of point $r$ on the line $y = r.y$ in region $R_P$ (in other words, at least one (or both) of the two sides
(left and right) is empty). If $p$ is exposed at time $t$, then point $r$ may potentially be a corner point
for $Q$ at time $t$.
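The hidden/exposed classification can be checked mechanically. The Python sketch below is an illustration under stated assumptions: `pts` is the set $X \cup Y$ as `(x, y)` tuples, `px` is the key $p.x$, and the hypothetical parameters `lo` and `hi` are the x-coordinates of the two vertical lines bounding $R_P$. It returns `None` when the column of `px` holds no point before time `t`, a case the definitions exclude by requiring $t > p.y$.

```python
def state_at(pts, px, t, lo, hi):
    """Classify key px at time t as 'hidden' or 'exposed' with respect to region (lo, hi)."""
    column = [y for (x, y) in pts if x == px and y < t]
    if not column:
        return None  # p has not arrived before time t
    top = max(column)  # y-coordinate of the topmost point in the column of px
    # look for points on the line y = top, inside the region, on either side
    left = any(y == top and lo < x < px for (x, y) in pts)
    right = any(y == top and px < x < hi for (x, y) in pts)
    return "hidden" if left and right else "exposed"
```

With the points (2, 1), (1, 2), (2, 2), (3, 2) and region bounds 0.5 and 3.5, key 2 is exposed at time 2 (its topmost point (2, 1) is alone on its line) and hidden at time 3 (its topmost point (2, 2) has neighbours on both sides).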

Lemma 2.3. Let $p \in P$ be hidden at time $t$ and let it be exposed for the first time at time $t' + 1$.
Let $q$ be the point processed by GreedyArb at time $t'$. Then $q \in P$.

Proof. Since $p$ remains hidden at time $t'$, $\exists r$ such that $r$ has the largest y-coordinate in $T^p_{<t'}$, and
there exist points $r_l$ and $r_r$ respectively to the left and right of $r$ on line $y = r.y$ in $R_P$. Also since
$p$ is exposed for the first time at time $t' + 1$, there must be a point added by GreedyArb at $(p.x, t')$.
But if $q \in P_l$, then the rectangle $\square rq$ is already satisfied by $r_l$. Similarly, if $q \in (Q \cup Q_r)$, then the
rectangle $\square rq$ is satisfied by $r_r$. So if $q \in (P_l \cup Q \cup Q_r)$, then GreedyArb will not add any point at
$(p.x, t')$. So $q \in P$. □

Figure 3: At time $t$, $p \in X$ is hidden but $q \in X$ is exposed.

Corollary 2.4. A point $p \in P$ can be hidden by any other point $q \in X$, but it can be exposed by a
point in $P$ only. Also let $p \in P$ be hidden at time $t$ and be exposed for the first time at time $t' + 1$.
Let $q \in (P_l \cup Q \cup Q_r)$ with $t < q.y < t'$. Then GreedyArb cannot add any point at $(p.x, q.y)$.

Definition 2.4. For each $q \in M_p^P$, there exists a unique point $r \in X$ (we get the uniqueness
because of our assumption that on each vertical line $X$ has at most one point) such that $r.y < q.y$
and $r.x = q.x$. Let $r$ be called the parent of $q$ and be denoted as $parent(q)$.

Let $q_l$ and $q_r$ be the points in $M_p^P$ with the smallest and largest x-coordinates respectively. By
definition, for each point $q \in M_p^P \setminus \{q_r, q_l\}$, $parent(q)$ either remains hidden or is hidden at time
$p.y + 1$. Only $parent(q_l)$ and $parent(q_r)$ may become exposed at time $p.y + 1$. So a point $p \in P$
can expose at most two points in $P$.

Lemma 2.5. The total number of times the points in P change their state from hidden to exposed 
and vice versa is at most 5k. 

Proof. The point $p$ may initially be exposed. Once $p$ is hidden, it remains hidden till a point $q \in P$
exposes it. Each point $q \in P$ can expose at most two points of $P$. So the total number of points
that can be exposed is at most $2k$. After that, only $k$ points can become hidden again as $|P| = k$,
but these points cannot get exposed again. So the number of times points in $P$ change their state
is at most $2k$ (exposed to hidden) $+ 2k$ (hidden to exposed) $+ k$ (exposed to hidden) $= 5k$. □

Remark: This is a very crude analysis of the total number of times the points in $P$ can change
their state. The number of times the state changes is probably at most $2k$.

Lemma 2.6. $\sum_{p \in Q} |M_p^P| \le 7k$.

Proof. Assume for contradiction that $\sum_{p \in Q} |M_p^P| > 7k$. For each $p \in Q$, let the two points in $M_p^P$
with the smallest and largest x-coordinates be called extreme points. Let $Z$ be the set of all extreme
points in the set $\bigcup_{p \in Q} M_p^P$. So $|(\bigcup_{p \in Q} M_p^P) \setminus Z| > 5k$. By Corollary 2.4, no point can be added
in the column of a point that is already hidden while processing $p$. So for each point
$q \in (\bigcup_{p \in Q} M_p^P) \setminus Z$, $parent(q)$ must be exposed at time $p.y$ and gets hidden at time $p.y + 1$.
By Lemma 2.5, the number of times the points in $P$ can change state from exposed to hidden is
at most $5k$. So this gives a contradiction. So $\sum_{p \in Q} |M_p^P| \le 7k$. □
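The counting step behind this contradiction can be made explicit. The sets $M_p^P$ for distinct $p$ lie on distinct horizontal lines and are therefore disjoint, and each $p \in Q$ contributes at most two extreme points. Under these two observations, which the argument uses implicitly, and under the assumption made for contradiction:

```latex
\begin{align*}
|Z| &\le 2\,|Q| = 2k, \\
\Bigl|\Bigl(\bigcup_{p \in Q} M_p^P\Bigr) \setminus Z\Bigr|
    &= \sum_{p \in Q} |M_p^P| - |Z|
     \;\ge\; \sum_{p \in Q} |M_p^P| - 2k
     \;>\; 7k - 2k = 5k .
\end{align*}
```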

Thus we have shown that the number of points in $R_P$ added by GreedyArb when processing
points in $Q$ is $O(k)$. It follows by symmetry that the number of points in $R_Q$ added by GreedyArb
when processing points in $P$ is also $O(k)$.

Let $T_{X[1 \ldots n]}$ be the total number of points added by the GreedyArb algorithm. More generally, for
$1 \le i \le j \le n$, let $T_{X[i \ldots j]}$ be the maximum possible number of points added by GreedyArb in region
$R_{X[i,j]}$ when processing points in set $X[i,j]$, where $R_{X[i,j]}$ is the region between the vertical lines
passing through $i - 1/2$ and $j + 1/2$. Here the maximum is taken over all possible sets $X$ of size $n$,
satisfying our assumption in Sec. 1.1 (namely, each vertical line has at most one point of $X$, and
each horizontal line has at most one point of $X$). Now we can write the recurrence introduced at
the beginning of the proof. We divide the $2k$ points into two sets. Set $P = X[j \ldots j + k - 1]$ contains
the first half and the set $Q = X[j + k \ldots j + 2k - 1]$ contains the second half, and $R_P$ and $R_Q$ are
the corresponding regions.

$T_{X[j \ldots j+2k-1]}$ = the total number of points added when processing points of $P$ in $R_Q$ ($\sum_{p \in P} |M_p^Q|$, with $M_p^Q$ defined analogously to $M_p^P$) +
the total number of points added when processing points of $Q$ in $R_P$ ($\sum_{p \in Q} |M_p^P|$) +
the total number of points added when processing points of $P$ in $R_P$ ($T_{X[j \ldots j+k-1]}$) +
the total number of points added when processing points of $Q$ in $R_Q$ ($T_{X[j+k \ldots j+2k-1]}$).

Using Lemma 2.6 this gives $T_{X[j \ldots j+2k-1]} = T_{X[j \ldots j+k-1]} + T_{X[j+k \ldots j+2k-1]} + O(k)$,
which gives our desired result: $T_{X[1 \ldots n]} = O(n \log n)$.
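One way to read this recurrence is as a charging scheme: an added point at $(s, t)$, created while processing the input point in row $t$ with key $x$, is counted at the recursion level where keys $s$ and $x$ first fall into different halves. The Python sketch below tallies added points by that level. It is only an illustration: the names `split_level` and `charge_by_level` are hypothetical, the midpoint split is an assumption (it matches the equal halves used above when $n$ is a power of two), and `Y` is assumed to contain only added points, so that no point of `Y` shares both row and key with a point of `X`.

```python
from collections import Counter

def split_level(a, b, lo, hi):
    """Depth at which distinct keys a, b in [lo, hi] first fall into different halves."""
    level = 0
    while True:
        mid = (lo + hi) // 2  # left half is [lo, mid], right half is [mid + 1, hi]
        if a <= mid < b or b <= mid < a:
            return level
        if a <= mid:  # both keys are in the left half
            hi = mid
        else:         # both keys are in the right half
            lo = mid + 1
        level += 1

def charge_by_level(X, Y):
    """Histogram: recursion level -> number of added points charged to that level."""
    key_at_time = {y: x for (x, y) in X}  # one input point per horizontal line
    n = len(X)
    return Counter(split_level(s, key_at_time[t], 1, n) for (s, t) in Y)
```

For the input (2, 1), (1, 2), (3, 3) with added points (2, 2) and (2, 3), one point is charged to level 0 and one to level 1. The recurrence corresponds to bounding the total charged to each level by $O(n)$ over $O(\log n)$ levels.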